Language models are widely deployed to provide automatic text completion services in user products. However, recent research has revealed that language models (especially large ones) bear considerable risk of memorizing private training data, which is then vulnerable to leakage and extraction by adversaries. In this study, we test the efficacy of a range of privacy-preserving techniques to mitigate unintended memorization of sensitive user text, while varying other factors such as model size and adversarial conditions. We test both "heuristic" mitigations (those without formal privacy guarantees) and Differentially Private training, which provides provable levels of privacy at the cost of some model performance. Our experiments show that, with the exception of L2 regularization, heuristic mitigations are largely ineffective in preventing memorization in our test suite, possibly because they rest on overly strong assumptions about the characteristics that define "sensitive" or "private" text. In contrast, Differential Privacy reliably prevents memorization in our experiments, despite its computational and model-performance costs.
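To make the contrast concrete, below is a minimal sketch of the per-example clipping and noising step at the heart of Differentially Private training (DP-SGD); the toy least-squares model, names, and hyperparameters are illustrative assumptions, not the study's setup.

```python
# Hedged sketch (not the study's code): the per-example clipping + Gaussian
# noising step at the core of DP-SGD, on a toy least-squares problem.
import numpy as np

def dp_sgd_step(w, X, y, lr=0.1, clip_norm=1.0, noise_mult=1.1,
                rng=np.random.default_rng(0)):
    """One DP-SGD step: clip each example's gradient, then add noise."""
    grads = []
    for xi, yi in zip(X, y):
        g = 2.0 * (xi @ w - yi) * xi                              # per-example gradient
        g *= min(1.0, clip_norm / max(np.linalg.norm(g), 1e-12))  # bound each example's influence
        grads.append(g)
    noise = rng.normal(0.0, noise_mult * clip_norm, size=w.shape) # Gaussian mechanism
    return w - lr * (np.sum(grads, axis=0) + noise) / len(X)

rng = np.random.default_rng(1)
X = rng.normal(size=(32, 5))
w_true = np.ones(5)
y = X @ w_true
w = np.zeros(5)
for _ in range(300):
    w = dp_sgd_step(w, X, y)
print(np.round(w, 2))  # near w_true, up to clipping bias and privacy noise
```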
Large pretrained models can be privately fine-tuned to achieve performance approaching that of non-private models. A common theme in these results is the surprising observation that high-dimensional models can achieve favorable privacy-utility trade-offs. This seemingly contradicts the known model-size dependence of differentially private convex learning, and raises the following research question: when does the performance of differentially private learning not degrade as model size grows? We identify the magnitude of gradients projected onto a subspace as a key factor that determines performance. To characterize this precisely for private convex learning, we introduce a condition that we call restricted Lipschitz continuity and derive bounds on the excess empirical and population risks that are dimension-independent under additional conditions. We show empirically that, in private fine-tuning of large language models, gradients evaluated near a local optimum are mostly controlled by a few principal components. This behavior resembles the conditions under which we obtain dimension-independent bounds in the convex setting. Together, our theoretical and empirical results provide a possible explanation for the success of large-scale private fine-tuning.
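As a rough schematic of the kind of condition described here (the notation below is our assumption, not the paper's exact statement), a restricted-Lipschitz-style gradient bound might read:

```latex
% Schematic only: with P_k the projector onto a rank-k subspace capturing the
% dominant gradient components,
\[
\|(I - P_k)\,\nabla \ell(w)\| \le G_k \quad \text{for all } w ,
\]
% if G_k decays quickly in k, excess-risk bounds can scale with k and G_k
% instead of the ambient dimension d.
```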
Recent work has demonstrated successful extraction of training data from generative language models. However, it is not obvious whether such extraction is feasible for text classification models, since the training objective is to predict a class label rather than the next word. This poses an interesting challenge and raises an important question about the privacy of training data in text classification settings. We therefore study potential privacy leakage in the text classification domain by investigating the problem of unintended memorization of training data that is irrelevant to the learning task. We propose an algorithm that extracts the missing tokens of a partial text by exploiting the likelihood the model assigns to the class label. We test the effectiveness of the algorithm by inserting canaries into the training set and attempting to extract their tokens after training. In our experiments, we demonstrate that extraction succeeds to a meaningful extent. The approach can also serve as an auditing strategy for assessing any unauthorized use of personal data collected without consent.
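A minimal sketch of the extraction idea as we read it from the abstract (our reconstruction, not the authors' code): greedily fill each missing slot with the token that maximizes the classifier's likelihood for the canary's label. The toy scorer below is invented purely to exercise the loop.

```python
# Hedged reconstruction of label-likelihood-guided token extraction.
def extract_missing_tokens(label_prob, prefix, num_slots, vocab, label):
    """Greedy fill: label_prob(text, label) -> model's P(label | text)."""
    filled = list(prefix)
    for _ in range(num_slots):
        best = max(vocab, key=lambda tok: label_prob(" ".join(filled + [tok]), label))
        filled.append(best)                  # keep the highest-likelihood token
    return filled[len(prefix):]

# Toy stand-in scorer: pretend the classifier grows more confident as the
# memorized canary "my pin is 1 2 3" is reproduced verbatim.
def toy_prob(text, label):
    return sum(text.startswith("my pin is " + s) for s in ["1", "1 2", "1 2 3"])

print(extract_missing_tokens(toy_prob, ["my", "pin", "is"], 3, ["0", "1", "2", "3"], "pin"))
# -> ['1', '2', '3']
```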
We give simpler, sparser, and faster algorithms for differentially private fine-tuning of large-scale pretrained language models that achieve state-of-the-art privacy-utility trade-offs on many standard NLP tasks. We propose a meta-framework for this problem, inspired by the success of highly parameter-efficient methods for fine-tuning. Our experiments show that differentially private adaptations of these approaches outperform previous private algorithms in three important respects: utility, privacy, and the computational and memory cost of private training. On many commonly studied datasets, the utility of private models approaches that of non-private ones. For example, on the MNLI dataset we achieve an accuracy of 87.8% using RoBERTa-Large and 83.5% using RoBERTa-Base with a privacy budget of ε = 6.7. In comparison, absent privacy constraints, RoBERTa-Large achieves an accuracy of 90.2%. Our findings are similar for natural language generation tasks. On the DART dataset, privately fine-tuned GPT-2-Small, GPT-2-Medium, GPT-2-Large, and GPT-2-XL achieve BLEU scores of 38.5, 42.0, 43.1, and 43.8, respectively (at ε = 6.8, δ = 1e-5), whereas the non-private baseline is 48.1. All of our experiments suggest that larger models are better suited for private fine-tuning: while they are well known to achieve superior accuracy non-privately, we find that they also better maintain their accuracy when privacy is introduced.
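A small illustration of the parameter-efficient idea behind such a meta-framework, under our own assumptions rather than the paper's code: freeze the pretrained weight and train only a low-rank adapter, so the private mechanism only touches a small set of parameters.

```python
# Illustrative only: frozen pretrained weight plus a trainable low-rank
# adapter; private training then clips/noises gradients of A and B alone.
import numpy as np

d, k, r = 64, 8, 4                        # input dim, output dim, adapter rank
rng = np.random.default_rng(0)
W0 = rng.normal(size=(k, d))              # pretrained weight, frozen
A = np.zeros((k, r))                      # trainable low-rank factors
B = rng.normal(scale=0.01, size=(r, d))

def forward(x):
    return (W0 + A @ B) @ x               # adapted layer: W0 + A B

print(forward(np.ones(d)).shape)          # (8,), same interface as the frozen layer
# A DP optimizer updates only A and B -- at realistic widths a tiny fraction
# of the layer -- cutting the memory/compute cost of per-example gradients.
```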
Text segmentation aims to divide text into contiguous, semantically coherent segments, while segment labeling deals with producing labels for each segment. Past work has achieved success in tackling segmentation and labeling for documents and conversations, but only through combinations of task-specific pipelines and supervised and unsupervised learning objectives. In this work, we propose a single encoder-decoder neural network that can handle long documents and conversations, trained to perform both segmentation and segment labeling using only standard supervision. We successfully show a way to solve the combined task as a purely generative task, which we call structured summarization. We apply the same technique to both document and conversational data, and we show state-of-the-art performance on the individual datasets in both high-resource and low-resource settings. Our results make a strong case for considering text segmentation and segment labeling holistically, and for moving toward general-purpose techniques that do not rely on domain expertise or task-specific components.
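As a hedged illustration of casting segmentation plus labeling as pure generation, one possible target serialization is sketched below; the format and names are our assumptions, not the paper's.

```python
# Possible target serialization for treating joint segmentation + labeling
# as a single generation task; the "label: text | ..." format is invented.
def to_generation_target(segments):
    """segments: list of (label, sentences) pairs in document order."""
    return " | ".join(f"{label}: {' '.join(sents)}" for label, sents in segments)

doc = [("intro", ["We study X.", "X matters."]),
       ("method", ["We propose Y."])]
print(to_generation_target(doc))
# -> intro: We study X. X matters. | method: We propose Y.
```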
With the wide availability of large pretrained language models such as GPT-2 and BERT, a recent trend is to fine-tune a pretrained model to achieve state-of-the-art performance on a downstream task. A natural example is the "Smart Reply" application, where a pretrained model is tuned to suggest replies to a given query message. Since these models are often tuned on sensitive data such as emails or chat transcripts, it is important to understand and mitigate the risk that a model leaks its tuning data. We investigate potential information-leakage vulnerabilities in a typical Smart Reply pipeline and introduce a new type of active extraction attack that exploits canonical patterns in text containing sensitive data. We show experimentally that an adversary can extract sensitive user information present in the training data. We explore potential mitigation strategies and demonstrate empirically how differential privacy can serve as an effective defense mechanism against such pattern extraction attacks.
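A minimal, hypothetical sketch of this attack category (not the authors' method): probe the pipeline with a query built from a canonical pattern and rank candidate secrets by the model's scoring function. All names here are illustrative.

```python
# Hypothetical pattern-based probe; score_reply is assumed to expose how
# strongly the model pairs a query with a reply.
def probe(score_reply, pattern, candidates, reply):
    """score_reply(query, reply) -> model score; higher = more plausible pair."""
    return max(candidates, key=lambda c: score_reply(pattern.format(c), reply))

# e.g., with a hypothetical scorer:
#   best = probe(model_score, "My flight code is {}", all_codes, "Have a safe trip!")
```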
Recently, several image segmentation methods that welcome and exploit different types of user assistance have been developed. In these methods, user input can be provided by drawing bounding boxes around image objects, drawing scribbles, or planting seeds that help delineate object boundaries, or by interactively correcting mis-segmented image regions. Because these inputs vary in type and amount, relative evaluation of different segmentation methods becomes difficult. As a possible solution, we propose a simple yet effective statistical segmentation method that can handle and exploit different input types and amounts. The proposed method is based on robust hypothesis testing, specifically the DGL test, and can be implemented with a time complexity that is linear in the number of pixels and quadratic in the number of image regions. It is therefore suitable as a baseline method for quickly benchmarking and evaluating the relative performance improvements of different types of user-assisted segmentation algorithms. We provide a mathematical analysis of the proposed method's operation, discuss its capabilities and limitations, give design guidelines, and present simulations that validate its operation.
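For flavor, here is a generic two-sample merge test in the same statistical spirit, using an L1 histogram distance as a stand-in; this is not the DGL test itself, and the threshold is illustrative.

```python
# Generic stand-in for a robust two-sample region-merge test (NOT the DGL test).
import numpy as np

def should_merge(pix_a, pix_b, thresh=0.1):
    """Merge two regions when their intensity histograms are close in L1."""
    ha, _ = np.histogram(pix_a, bins=16, range=(0, 1), density=True)
    hb, _ = np.histogram(pix_b, bins=16, range=(0, 1), density=True)
    return np.abs(ha - hb).mean() < thresh

rng = np.random.default_rng(0)
print(should_merge(rng.uniform(0, 1, 5000), rng.uniform(0, 1, 5000)))      # typically True
print(should_merge(rng.uniform(0, 0.3, 5000), rng.uniform(0.7, 1, 5000)))  # False
```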
Variational inference uses optimization, rather than integration, to approximate the marginal likelihood, and thereby the posterior, in a Bayesian model. Thanks to advances in computational scalability made in the last decade, variational inference is now the preferred choice for many high-dimensional models and large datasets. This tutorial introduces variational inference from the parametric perspective that dominates these recent developments, in contrast to the mean-field perspective commonly found in other introductory texts.
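The standard identity behind this optimization view is worth stating: maximizing the evidence lower bound (ELBO) simultaneously tightens a lower bound on the log marginal likelihood and drives the variational distribution toward the posterior.

```latex
% For any variational distribution q(z),
\[
\log p(x)
= \underbrace{\mathbb{E}_{q(z)}\!\left[\log \frac{p(x, z)}{q(z)}\right]}_{\mathrm{ELBO}(q)}
+ \mathrm{KL}\!\left(q(z)\,\|\,p(z \mid x)\right),
\]
% so maximizing the ELBO over q both tightens a lower bound on the marginal
% likelihood (since KL >= 0) and pulls q(z) toward the posterior p(z | x).
```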
Knowledge graphs (KG) have served as a key component of various natural language processing applications. Commonsense knowledge graphs (CKG) are a special type of KG, where entities and relations are composed of free-form text. However, previous work on KG and CKG completion suffers from long-tail relations and newly added relations that do not have many known triples for training. In light of this, few-shot KG completion (FKGC), which requires the strengths of graph representation learning and few-shot learning, has been proposed to tackle the problem of limited annotated data. In this paper, we comprehensively survey previous attempts on such tasks, covering both methods and applications. Specifically, we first introduce the challenges of FKGC and the commonly used KGs and CKGs. Then we systematically categorize and summarize existing work by the type of KG and by method. Finally, we present applications of FKGC models on prediction tasks in different areas and share our thoughts on future research directions for FKGC.
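As a toy illustration of the FKGC setup (our own framing, with invented data): a handful of support triples for a long-tail relation define the few-shot task, and the model must rank candidate tails for a query head.

```python
# Toy schematic of a few-shot KG completion episode (invented data).
support = [("Einstein", "won_award", "Nobel_Prize"),
           ("Curie", "won_award", "Nobel_Prize")]      # K = 2 "shots"
query_head, relation = "Feynman", "won_award"
candidates = ["Nobel_Prize", "Fields_Medal", "Turing_Award"]
# An FKGC model embeds the support set (few-shot learning) together with the
# graph neighborhood (graph representation learning) to score each candidate.
```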
Few Shot Instance Segmentation (FSIS) requires models to detect and segment novel classes given only a few support examples. In this work, we explore a simple yet unified solution for FSIS and its incremental variants, and introduce a new framework named Reference Twice (RefT) to fully exploit the relationship between support and query features in a Transformer-like framework. Our key insights are twofold. First, with the aid of support masks, we can generate dynamic class centers to re-weight query features more appropriately. Second, we find that support object queries have already encoded key factors after base training. The query features can thus be enhanced twice, at the feature level and at the instance level. In particular, we first design a mask-based dynamic weighting module to enhance support features, and then propose to link object queries for better calibration via cross-attention. After these steps, performance on novel classes improves significantly over our strong baseline. Additionally, our framework extends easily to incremental FSIS with minor modification. When benchmarked on the COCO dataset under the FSIS, gFSIS, and iFSIS settings, our method achieves competitive performance compared to existing approaches across different shot counts; e.g., we boost nAP by a noticeable +8.2/+9.4 over the current state-of-the-art FSIS method for 10/30-shot. We further demonstrate the superiority of our approach on Few Shot Object Detection. Code and model will be available.
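A hedged sketch of the mask-based re-weighting idea as we read it from the abstract (not the released code): pool support features under the support mask into a dynamic class center, then scale query features by their similarity to that center.

```python
# Our reading of mask-based dynamic re-weighting; shapes and the cosine
# similarity choice are assumptions, not the paper's implementation.
import numpy as np

def reweight_query(support_feat, support_mask, query_feat):
    """support_feat: (H, W, C); support_mask: (H, W) in {0,1}; query_feat: (H, W, C)."""
    center = (support_feat * support_mask[..., None]).sum((0, 1)) / support_mask.sum()
    sim = query_feat @ center / (np.linalg.norm(query_feat, axis=-1) *
                                 np.linalg.norm(center) + 1e-8)   # cosine per location
    return query_feat * sim[..., None]                            # emphasize class-like regions

rng = np.random.default_rng(0)
sf, qf = rng.normal(size=(8, 8, 16)), rng.normal(size=(8, 8, 16))
mask = (rng.uniform(size=(8, 8)) > 0.5).astype(float)
print(reweight_query(sf, mask, qf).shape)  # (8, 8, 16)
```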